Work by A. V. Oppenheim and his students and collaborators is summarized
in the following projects: homomorphic speech analysis-synthesis, enhancement
of degraded speech, time-varying linear predictive coding of speech signals,
and digital seismic signal processing.

1. HOMOMORPHIC SPEECH ANALYSIS-SYNTHESIS


U.  S.  Navy  — Office  of  Naval  Research  (Contract  N00014-75-C-0951) 
Thomas  F.  Quatieri,  Jr. , Alan  V.  Oppenheim,  Antonio  Ruiz 


a. Simulation of a Homomorphic Vocoder Based on Charge-Coupled Device (CCD) Technology

We have completed simulations of various speech analysis-synthesis configurations
based on both the conventional chirp-z-transform (CZT) realization of the discrete
Fourier transform and the sliding CZT realization of the discrete sliding Fourier
transform. These realizations are amenable to CCD technology and allow for real-time,
low-cost implementation of the homomorphic vocoder.
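
As a rough illustration of why the CZT realization suits transversal-filter hardware, the identity nk = (n^2 + k^2 - (k - n)^2)/2 turns the DFT into a chirp premultiplication, a convolution with a chirp, and a chirp postmultiplication. The sketch below assumes NumPy; the function name `dft_via_chirp` is hypothetical, and the convolution is done in software where a CCD filter would do it in hardware. It does not cover the sliding CZT.

```python
import numpy as np

def dft_via_chirp(x):
    """Compute the DFT as premultiply -> chirp convolution -> postmultiply."""
    x = np.asarray(x, dtype=complex)
    n_pts = len(x)
    n = np.arange(n_pts)
    # Chirp W^(n^2 / 2) with W = exp(-j 2 pi / N)
    chirp = np.exp(-1j * np.pi * n**2 / n_pts)
    # Step 1: premultiply the input by the chirp
    g = x * chirp
    # Step 2: convolve with the conjugate chirp over lags -(N-1) .. (N-1)
    lags = np.arange(-(n_pts - 1), n_pts)
    h = np.exp(1j * np.pi * lags**2 / n_pts)
    conv = np.convolve(g, h)
    # Output sample k sits at index k + (N - 1) of the full convolution
    y = conv[n_pts - 1 : 2 * n_pts - 1]
    # Step 3: postmultiply by the chirp
    return y * chirp

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)
    print(np.allclose(dft_via_chirp(x), np.fft.fft(x)))
```

Only step 2 involves a filter with many taps, which is the part a CCD transversal filter would be expected to implement.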

A comparative  study  was  performed,  illustrating  the  tradeoffs  between  synthetic 
speech  quality  and  implementational  complexity  for  the  two  schemes. 

b.  Quality  Improvement 

Techniques for synthetic speech quality improvement were tested and evaluated in
collaboration with studies of the Speech Group at Lincoln Laboratory. Issues such as
interpolation, amplitude measurements, buzziness, hoarseness, and coding were
explored. For male speech, formal listening tests indicate that for low bit rates (2400 to
3600 bps) the homomorphic system is comparable in quality to more established schemes
such as LPC and the channel vocoder. For high bit rates (8000 to 9600 bps), the
homomorphic system was judged to have the highest quality.

The  female  synthetic  speech,  on  the  other  hand,  unlike  that  of  the  male,  tends  in 
general  to  be  degraded  by  a "hoarseness."  We  are  investigating  ways  of  rigorously 
characterizing  this  degradation  and  are  exploring  adaptive  techniques  for  improvement. 
We shall soon be completing a pitch-synchronous complex cepstral vocoder. This
scheme, we hope, will yield an understanding of the effects of phase of the estimated
vocal tract impulse response on synthetic speech quality.
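
To make the phase question concrete: a vocoder that keeps only the low-time part of the real cepstrum discards phase, so the synthesizer has to assume one. The sketch below, assuming NumPy, builds a zero-phase and a minimum-phase impulse response from the same liftered cepstrum; both have identical magnitude spectra and differ only in phase. The function names, FFT size, and cutoff are illustrative assumptions, not parameters from the work described here, and the complex cepstrum itself (which retains the measured phase) is not computed.

```python
import numpy as np

def low_time_cepstrum(frame, n_fft=512, cutoff=30):
    """Real cepstrum of a frame, keeping only low-time samples (cutoff >= 2)."""
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    cep = np.fft.ifft(log_mag).real          # even sequence
    lifter = np.zeros(n_fft)
    lifter[:cutoff] = 1.0                    # quefrencies 0 .. cutoff-1
    lifter[-(cutoff - 1):] = 1.0             # mirrored negative quefrencies
    return cep * lifter

def zero_phase_response(cep):
    """Impulse response whose spectrum is exp(smoothed log magnitude), zero phase."""
    return np.fft.ifft(np.exp(np.fft.fft(cep))).real

def minimum_phase_response(cep):
    """Fold the even cepstrum onto positive quefrencies to get minimum phase."""
    n = len(cep)
    folded = np.zeros(n)
    folded[0] = cep[0]
    folded[1 : n // 2] = 2.0 * cep[1 : n // 2]
    folded[n // 2] = cep[n // 2]
    return np.fft.ifft(np.exp(np.fft.fft(folded))).real

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(256) * np.hanning(256)
    cep = low_time_cepstrum(frame)
    h_zero = zero_phase_response(cep)
    h_min = minimum_phase_response(cep)
    same_mag = np.allclose(np.abs(np.fft.fft(h_zero)), np.abs(np.fft.fft(h_min)))
    print("identical magnitude spectra:", same_mag)
```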


2.  ENHANCEMENT  OF  DEGRADED  SPEECH 

U.  S.  Navy  — Office  of  Naval  Research  (Contract  N00014-75-C-0951) 
Jae  S.  Lim,  Alan  V.  Oppenheim 


Continuing our work on enhancing degraded speech, we have attempted to develop a
complete analysis/synthesis system in which the synthesis parameters are estimated
from noisy speech data. The particular analysis/synthesis system we have considered
is based on an all-pole model of speech. Our approach has been to apply a Maximum
A Posteriori (MAP) estimation procedure in estimating the coefficient vector of an
all-pole system from noisy speech, accounting for the presence of noise. In general, a MAP
estimation procedure for noisy speech leads to solving a set of nonlinear equations. Two
suboptimal procedures which require solving only sets of linear equations have,
however, been developed. These methods have been applied to both synthetic and real
speech data with white Gaussian background noise, and our preliminary listening tests
indicate that both systems are capable of significant noise reduction. We are now
engaged in a more formal subjective test which is directed toward evaluating the two
linear systems in terms of their performance in enhancing speech intelligibility and
quality when the background noise has various spectra.
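
The sketch below illustrates why the noise has to be accounted for when fitting an all-pole model. It is not either of the two MAP-derived procedures mentioned above, which are not specified here. It uses a simpler, well-known device: for white noise of known variance, subtract that variance from the zero-lag autocorrelation before solving the linear normal equations. NumPy is assumed, and all names and the synthetic second-order example are hypothetical.

```python
import numpy as np

def autocorr(x, max_lag):
    """Biased autocorrelation estimate for lags 0 .. max_lag."""
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / n for k in range(max_lag + 1)])

def allpole_coeffs(r):
    """Solve the normal equations R a = r[1:] for x[n] ~ sum_k a[k] x[n-k]."""
    p = len(r) - 1
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])

def noise_compensated_coeffs(y, order, noise_var):
    """White noise adds only to lag 0, so remove it there before solving."""
    r = autocorr(y, order)
    r[0] -= noise_var
    return allpole_coeffs(r)

# Synthetic all-pole signal (assumed example values) plus white Gaussian noise
rng = np.random.default_rng(0)
true_a = np.array([1.5, -0.7])
n_samples = 20000
excitation = rng.standard_normal(n_samples)
clean = np.zeros(n_samples)
for i in range(2, n_samples):
    clean[i] = true_a[0] * clean[i - 1] + true_a[1] * clean[i - 2] + excitation[i]
noise_var = 1.0
noisy = clean + np.sqrt(noise_var) * rng.standard_normal(n_samples)

naive = allpole_coeffs(autocorr(noisy, 2))          # ignores the noise
compensated = noise_compensated_coeffs(noisy, 2, noise_var)

if __name__ == "__main__":
    print("true       :", true_a)
    print("naive      :", naive)
    print("compensated:", compensated)
```

Ignoring the noise biases the coefficients toward a flatter spectrum; the compensated estimate stays linear to compute but requires the noise variance to be known or estimated separately.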


3. TIME-VARYING LINEAR PREDICTIVE CODING OF SPEECH SIGNALS

U.  S.  Navy  — Office  of  Naval  Research  (Contract  N00014-75-C-0951) 

Mark  G.  Hall,  Alan  V.  Oppenheim,  Alan  S.  Willsky 

[Prof.  Willsky  is  Assistant  Director  of  the  Electronic  Systems  Laboratory,  M.  I.  T.] 

During this year we have completed a project on time-varying predictive
coding of speech. The project involved a generalization of linear prediction using
time-varying coefficients. By representing each time-varying coefficient either in
terms of a power series or in terms of a Fourier series, a set of equations to
determine the coefficients was obtained. The coefficients of these equations are reminiscent
of those in the time-invariant case in that they are block-symmetric or block-Toeplitz. The basic
problem can be formulated either in a covariance form or in a correlation form, and the
relative characteristics of these two approaches were explored. Through a study of
a number of synthetic examples and examples using real data, we concluded that the
covariance form with a power series representation for the coefficients was
preferable and that this approach has the potential for representing a long nonstationary
segment of speech with fewer total coefficients than would be required through the use
of time-invariant LPC in which an analysis window is moved through the data.
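
A minimal sketch of the covariance form with a power series representation follows, assuming NumPy. Each predictor coefficient is modeled as a_i(n) = sum_k c[i, k] t^k, and the c[i, k] are found by ordinary least squares over the whole segment, which keeps the problem linear. Normalizing time to t = n/N, the function name, and the first-order synthetic example are assumptions made for illustration only.

```python
import numpy as np

def tv_lpc_power_series(x, order, degree):
    """Least-squares fit of x[n] ~ sum_i a_i(n) x[n-i], a_i(n) = sum_k c[i,k] t^k."""
    x = np.asarray(x, dtype=float)
    n_total = len(x)
    n_idx = np.arange(order, n_total)       # samples with a full prediction history
    t = n_idx / n_total                     # normalized time (assumption)
    columns = []
    for i in range(1, order + 1):
        for k in range(degree + 1):
            columns.append(t**k * x[n_idx - i])
    design = np.column_stack(columns)
    c, *_ = np.linalg.lstsq(design, x[n_idx], rcond=None)
    return c.reshape(order, degree + 1)     # row i-1 holds the series for a_i

# Synthetic nonstationary signal: one coefficient drifting linearly from 0.2 to 0.8
rng = np.random.default_rng(0)
n_samples = 8000
signal = np.zeros(n_samples)
for n in range(1, n_samples):
    a_n = 0.2 + 0.6 * n / n_samples
    signal[n] = a_n * signal[n - 1] + rng.standard_normal()

coeffs = tv_lpc_power_series(signal, order=1, degree=1)

if __name__ == "__main__":
    print("estimated [constant, slope]:", coeffs[0])
```

Here two numbers describe the whole segment, whereas a sliding time-invariant analysis would produce one coefficient per window position.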

4.  DIGITAL  SEISMIC  SIGNAL  PROCESSING 

U.  S.  Navy  — Office  of  Naval  Research  (Contract  N00014-75-C-0951) 

National  Science  Foundation  (Grant  ENG76-24117) 

David  B.  Harris 

A seismic surveying technique called wave equation migration is being investigated
for possible applications of two-dimensional digital signal processing algorithms. The
key component of the migration algorithm is a difference equation approximation to the
wave equation. This difference equation is used to extrapolate a wave field recorded
on the boundary of a region backwards into the region. An ideal transfer function for
the two-dimensional difference equation can be derived from the wave equation.
Currently, methods are being sought to approximate this transfer function, which is all-pass
with a specified phase.
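
For a constant-velocity medium, one standard way to obtain such an ideal response is to Fourier transform the two-dimensional scalar wave equation over the horizontal coordinate and time; each depth step then multiplies the field by exp(j k_z dz), with k_z = sqrt((omega/c)^2 - k_x^2). The sketch below, assuming NumPy, evaluates that response and checks that it has unit magnitude wherever the wave propagates. The sign of the exponent depends on the transform convention, setting the evanescent region to zero is an arbitrary choice, and the function name and sample values are hypothetical.

```python
import numpy as np

def extrapolation_response(kx, omega, dz, velocity):
    """Ideal depth-step response on a (kx, omega) grid for constant velocity."""
    kx_grid, omega_grid = np.meshgrid(kx, omega, indexing="ij")
    kz_squared = (omega_grid / velocity) ** 2 - kx_grid**2
    propagating = kz_squared > 0             # elsewhere the wave is evanescent
    kz = np.sqrt(np.where(propagating, kz_squared, 0.0))
    # Pure phase shift in the propagating region, zero elsewhere (assumption)
    response = np.where(propagating, np.exp(1j * kz * dz), 0.0)
    return response, propagating

if __name__ == "__main__":
    kx = np.linspace(-0.05, 0.05, 11)        # rad/m
    omega = np.linspace(10.0, 100.0, 10)     # rad/s
    response, propagating = extrapolation_response(kx, omega, dz=10.0, velocity=2000.0)
    print(np.allclose(np.abs(response[propagating]), 1.0))
```

The square root is what makes the phase awkward to match with a low-order two-dimensional difference equation.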

5.  TIME  SCALE  MODIFICATIONS  OF  SPEECH  SIGNALS 
Michael  R.  Portnoff 

The objective of our research in this area is to modify a speech signal in such a
manner that the resulting signal is perceived as identical to the original except for its rate
of articulation. In particular, we seek to preserve such qualities as naturalness,
intelligibility, and speaker-dependent features, while avoiding the introduction of such
objectionable artifacts as "glitches," "burbles," and reverberation often present in
vocoded speech.

We have developed and demonstrated a high-quality system for time-scale
compression and expansion of speech based on short-time Fourier analysis [1, 2]. This system is
capable of compressing speech by ratios as large as 3:1 and expanding speech by
arbitrarily large ratios. Furthermore, the performance of this system does not appear to
be sensitive to the presence of broadband noise in the speech source material.
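
The general idea of changing articulation rate through short-time Fourier analysis can be sketched with a textbook phase vocoder, shown below under the assumption that NumPy is available. This is a generic illustration and not the system described above: magnitudes are read at a different rate than they were analyzed, and the phase of each bin is advanced by its measured instantaneous frequency so that pitch is preserved. The function name, window, FFT size, and hop are all assumptions.

```python
import numpy as np

def time_scale(x, rate, n_fft=1024, hop=256):
    """Phase-vocoder time scaling: rate > 1 compresses, rate < 1 expands."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    # Phase advance per hop expected at each bin's center frequency
    omega = 2 * np.pi * hop * np.arange(spec.shape[1]) / n_fft
    # Fractional analysis-frame positions visited by the synthesizer
    steps = np.arange(0, n_frames - 1, rate)
    phase = np.angle(spec[0])
    out = np.zeros(len(steps) * hop + n_fft)
    norm = np.zeros_like(out)
    for k, s in enumerate(steps):
        i = int(np.floor(s))
        frac = s - i
        # Interpolate magnitude between neighboring analysis frames
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        frame = np.fft.irfft(mag * np.exp(1j * phase), n=n_fft)
        out[k * hop : k * hop + n_fft] += window * frame
        norm[k * hop : k * hop + n_fft] += window**2
        # Deviation from the expected advance, wrapped to [-pi, pi]
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase = phase + omega + dphi
    # Overlap-add normalization; the floor avoids division by ~0 at the edges
    return out / np.maximum(norm, 1e-3)

if __name__ == "__main__":
    fs = 8000
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 440 * t)
    slow = time_scale(tone, rate=0.5)
    print(len(tone), len(slow))
```

With `rate=0.5` the output is roughly twice as long as the input while a steady tone keeps its frequency; `rate=3.0` corresponds to 3:1 compression.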

References 
2. M. R. Portnoff, "A Mathematical Framework for Time-Scale Modification of Speech
Signals," Ninety-third Meeting, Acoustical Society of America, State College,
Pennsylvania, June 7-10, 1977. (Abstract in J. Acoust. Soc. Am., Vol. 61, Suppl.
No. 1, Spring 1977, p. S68.)

6. ALGORITHMS FOR HIGHLY PARALLEL COMPUTER STRUCTURES

National  Aeronautics  and  Space  Administration  (Grant  NSG-5157) 

James  H.  McClellan,  David  C.  LeDoux 

In order to process photographic images from Earth observation satellites rapidly,
computers with a very high degree of parallel processing capability are being designed.
One such machine would be composed of a minimum of 16,384 very simple processors
arranged in a 128 × 128 array. Each processing element contains a bit-serial adder
with a small amount of memory and is able to transfer data, one bit at a time, to any
of its four nearest neighbors. A central control unit issues an instruction to the entire
array and the individual processors decide whether or not to execute it, depending on
a mask bit in each processor. In use, each processor operates on one pixel in a
sampled photo and the parallel processing allows great time savings over more traditional
serial computers.
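
The control scheme can be mimicked at word level with array operations, as in the sketch below (NumPy assumed). One broadcast instruction, "add the value held by a given neighbor," is applied everywhere, and a per-processor mask bit decides whether each element commits the result. The bit-serial arithmetic is not modeled, wraparound at the array edges is an assumption about boundary handling, and the function name is hypothetical.

```python
import numpy as np

def masked_add_from_neighbor(data, mask, direction):
    """Broadcast 'add neighbor value'; only elements with mask set execute it."""
    # np.roll shift and axis that bring the named neighbor onto each element
    shifts = {"north": (1, 0), "south": (-1, 0), "west": (1, 1), "east": (-1, 1)}
    shift, axis = shifts[direction]
    neighbor = np.roll(data, shift, axis=axis)   # edges wrap around (assumption)
    return np.where(mask, data + neighbor, data)

if __name__ == "__main__":
    size = 128
    data = np.arange(size * size).reshape(size, size)
    rows, cols = np.indices((size, size))
    mask = (rows + cols) % 2 == 0                # checkerboard of active processors
    result = masked_add_from_neighbor(data, mask, "west")
    print(result[2, 2] == data[2, 2] + data[2, 1], result[2, 3] == data[2, 3])
```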

A frequent operation in satellite image processing is that of image registration, which
involves computing the cross-correlation between two images of the same scene.
Correlations may be computed indirectly and sometimes more efficiently by using
two-dimensional transforms such as the Fourier transform or the Fermat number
transform (FNT). The computation of the FNT does not involve multiplications or complex
numbers, and is well suited to the simplicity of the small processors. We have
investigated the use of the FNT to perform correlation on a highly parallel computer developed
by NASA. Programs have been written and timing estimates have been obtained. The
FNT can be more efficient than direct computation of the correlation (on this machine),
depending on the sizes of the images used.
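
A one-dimensional sketch of correlation by FNT is given below in plain Python. It works modulo the Fermat number 2^16 + 1 with 2 as the transform kernel, which has order 32, so the transform length is 32 and every kernel power is a power of two (a bit shift in hardware, although Python's `pow` is used here for brevity). The pointwise product of the two transforms still requires general multiplications. The transform length, the naive O(N^2) evaluation, and the function names are illustrative assumptions; the two-dimensional case and the NASA machine's actual programs are not represented.

```python
F = 2**16 + 1   # Fermat number F_4 = 65537 (prime)
ALPHA = 2       # 2**16 = -1 (mod F), so 2 has order 32 modulo F
N = 32          # transform length equals the order of ALPHA

def fnt(x, inverse=False):
    """Naive length-32 Fermat number transform of a list of integers."""
    sign = -1 if inverse else 1
    out = []
    for k in range(N):
        acc = 0
        for n, value in enumerate(x):
            # Kernel term is a power of two: a shift in hardware
            acc += value * pow(ALPHA, (sign * n * k) % N, F)
        out.append(acc % F)
    if inverse:
        n_inv = pow(N, F - 2, F)             # inverse of N modulo the prime F
        out = [(v * n_inv) % F for v in out]
    return out

def circular_correlation(x, y):
    """c[m] = sum_n x[n] * y[(n + m) mod N], computed exactly via the FNT."""
    X, Y = fnt(x), fnt(y)
    # X at index -k plays the role of the complex conjugate in the Fourier case
    product = [(X[(-k) % N] * Y[k]) % F for k in range(N)]
    return fnt(product, inverse=True)

if __name__ == "__main__":
    x = [(3 * i + 1) % 16 for i in range(N)]
    y = [(5 * i + 2) % 16 for i in range(N)]
    direct = [sum(x[n] * y[(n + m) % N] for n in range(N)) for m in range(N)]
    print(circular_correlation(x, y) == direct)
```

The result is exact, with no rounding, as long as every correlation value is smaller than the modulus; larger dynamic range needs a larger Fermat number or scaling.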

We are now investigating the applicability of the parallel computer structure to
several new image restoration algorithms, particularly those which attempt to correct for
the effects of the Earth's turbulent atmosphere.
